arXiv.org e-Print Archive

Routes for breaching and protecting genetic privacy

Author: A Acquisti
A Cavoukian
A Kong
A Machanavajjhala
A Narayanan
AD Johnson
AJ Pakstis
AK Manning
AL McGuire
Arvind Narayanan
B Fons
B Malin
B Malin
BA Malin
BM Henn
C Dwork
C Shannon
CD Huff
D Clayton
D He
D Zubakov
DJ Solve
DR Nyholt
DW Craig
EA Zerhouni
EE Schadt
EM Ramos
F Liu
G Church
H Lango Allen
H Li
HK Im
HS Venter
J Burn
J Gitschier
J Kaiser
J Kaye
J Kaye
J Lee
J Marchini
JE Lunshof
JH Park
JM Oliver
JP Roberts
K Benitez
K El Emam
K El Emam
K Silventoinen
KA Tryka
KB Jacobs
KS Kendler
L Kamm
L Sweeney
L Sweeney
LA Sweeney
LA Sweeney
LAP Kohn
LL Rodriguez
M Canim
M Gymrek
M Gymrek
M Kantarcioglu
M Kayser
MD Mailman
N Chatterjee
N Homer
NN Taleb
P Bohannon
P Kwok
P Ohm
P Paillier
PM Visscher
R Braun
R Drmanac
R Khan
R Noumeir
RL Bennett
S Byers
S McClure
S Sankararaman
S Walsh
SE Brenner
SF Terry
SH Friend
T Lumley
TE King
TE King
V Bafna
W Fu
W Hartzog
WG Hill
WW Lowrance
XL Ou
Yaniv Erlich
Z Lin
Publication date: 01/12/2013

We are entering the era of ubiquitous genetic information for research, clinical care, and personal curiosity. Sharing these datasets is vital for rapid progress in understanding the genetic basis of human diseases. However, one growing concern is the ability to protect the genetic privacy of the data originators. Here, we technically map threats to genetic privacy and discuss potential mitigation strategies for privacy-preserving dissemination of genetic data.

Comment: Draft for comment

Princeton University Open Access Repository

arXiv.org e-Print Archive

A Minimum Column Density of 1 g cm^-2 for Massive Star Formation

Author: A Parravano
A-K Jappsen
CD Matzner
CF McKee
CF McKee
Christopher F. McKee
CL Martin
DA Neufeld
ED Feigelson
EM Huff
F Motte
G Chabrier
IA Bonnell
IA Bonnell
J Braine
J Wu
JC Tan
JC Weingartner
KE Mueller
KE Young
Mark R. Krumholz
MH Heyer
MR Krumholz
MR Krumholz
MR Krumholz
MR Krumholz
R Plume
RB Larson
S Boissier
S Chakrabarti
YL Shirley
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 02/01/2008

Massive stars are very rare, but their extreme luminosities make them both the only type of young star we can observe in distant galaxies and the dominant energy sources in the universe today. They form rarely because efficient radiative cooling keeps most star-forming gas clouds close to isothermal as they collapse, and this favors fragmentation into stars <~1 Msun. Heating of a cloud by accreting low-mass stars within it can prevent fragmentation and allow formation of massive stars, but what properties a cloud must have to form massive stars, and thus where massive stars form in a galaxy, has not yet been determined. Here we show that only clouds with column densities >~ 1 g cm^-2 can avoid fragmentation and form massive stars. This threshold, and the environmental variation of the stellar initial mass function (IMF) that it implies, naturally explain the characteristic column densities of massive star clusters and the difference between the radial profiles of Halpha and UV emission in galactic disks. The existence of a threshold also implies that there should be detectable variations in the IMF with environment within the Galaxy and in the characteristic column densities of massive star clusters between galaxies, and that star formation rates in some galactic environments may have been systematically underestimated.

Comment: Accepted for publication in Nature; Nature manuscript style; main text: 14 pages, 3 figures; supplementary text: 8 pages, 1 figure

Cryptic Distant Relatives Are Common in Both Isolated and Cosmopolitan Genetic Samples

Author: A Albrechtsen
A Auton
A Gusev
A Kitchen
A Kong
A Price
B Derrida
B McEvoy
BL Browning
BM Henn
Brenna M. Henn
C O'Dushlaine
CD Huff
CR Gignoux
D Behar
D Rohde
FS Alkuraya
G Atzmon
G Leibon
G Malecot
Henry Harpending
I Moltke
Itsik Pe'er
J Li
J Novembre
J. Michael Macpherson
JL Mountain
JM Macpherson
Joanna L. Mountain
L Scott
L Weiss
Lawrence Hon
M Epstein
M Kirin
M Nalls
M Slatkin
M Zlojutro
N Rosenberg
N Rosenberg
N Rosenberg
Nick Eriksson
R McQuillan
RR Hudson
S Browning
S Ramachandran
S Tishkoff
S Wang
Serge Saxonov
SR Browning
W Bodmer
Publication venue: Public Library of Science
Publication date: 01/01/2012

Although a few hundred single nucleotide polymorphisms (SNPs) suffice to infer close familial relationships, high-density genome-wide SNP data make possible the inference of more distant relationships such as 2nd to 9th cousinships. In order to characterize the relationship between genetic similarity and degree of kinship given a timeframe of 100–300 years, we analyzed the sharing of DNA inferred to be identical by descent (IBD) in a subset of individuals from the 23andMe customer database (n = 22,757) and from the Human Genome Diversity Panel (HGDP-CEPH, n = 952). With data from 121 populations, we show that the average amount of DNA shared IBD in most ethnolinguistically-defined populations, for example Native American groups, Finns and Ashkenazi Jews, differs from continentally-defined populations by several orders of magnitude. Via extensive pedigree-based simulations, we determined bounds for predicted degrees of relationship given the amount of genomic IBD sharing in both endogamous and ‘unrelated’ population samples. Using these bounds as a guide, we detected tens of thousands of 2nd to 9th degree cousin pairs within a heterogeneous set of 5,000 Europeans. The ubiquity of distant relatives, detected via IBD segments, in both ethnolinguistic populations and in large ‘unrelated’ population samples has important implications for genetic genealogy, forensics and genotype/phenotype mapping studies.

CiteSeerX

eScholarship - University of California

Chapman University Digital Commons

Evolutionary Dynamics of Co-Segregating Gene Clusters Associated with Complex Diseases

Author: A Franke
A Keinan
AD Cutter
AD Johnson
AD Johnson
B Charlesworth
B Khor
BF Voight
C Pal
CB Foster
CD Huff
Christoph Preuss
David Wiedmann
DE Reich
Dirk Steinke
E Santiago
F Cheng
F Friedrichs
FA Reed
FA Reed
G McVicker
JC Barrett
JD Rioux
JM Akey
JP Hugot
JV Raelson
K Yamazaki
KE Lohmueller
KE Lohmueller
KM Wegner
L Southam
LA Hindorff
LB Barreiro
M Joron
M Krzystek-Korpacka
M Sémon
M-X Tang
Mona Riemenschneider
Monika Stoll
N Soranzo
P Flicek
PA Hohenlohe
PF O’Reilly
S Chun
S Durinck
S Myers
SB Gabriel
T Shiina
TIH Consortium
V Guryev
WG Hill
Y Ogura
Publication venue: Public Library of Science
Publication date: 14/05/2012

BACKGROUND: The distribution of human disease-associated mutations is not random across the human genome. Despite the fact that natural selection continually removes disease-associated mutations, an enrichment of these variants can be observed in regions of low recombination. There are a number of mechanisms by which such a clustering could occur, including genetic perturbations or demographic effects within different populations. Recent genome-wide association studies (GWAS) suggest that single nucleotide polymorphisms (SNPs) associated with complex disease traits are not randomly distributed throughout the genome, but tend to cluster in regions of low recombination.
PRINCIPAL FINDINGS: Here we investigated whether deleterious mutations have accumulated in regions of low recombination due to the impact of recent positive selection and genetic hitchhiking. Using publicly available data on common complex diseases and population demography, we observed an enrichment of hitchhiked disease associations in conserved gene clusters subject to selection pressure. Evolutionary analysis revealed that these conserved gene clusters arose by multiple concerted rearrangement events across the vertebrate lineage. We observed distinct clustering of disease-associated SNPs in evolutionarily rearranged regions of low recombination and high gene density, which harbor genes involved in immunity, that is, the interleukin cluster on 5q31 or RhoA on 3p21.
CONCLUSIONS: Our results suggest that multiple lineage-specific rearrangements led to a physical clustering of functionally related and linked genes exhibiting an enrichment of susceptibility loci for complex traits. This implies that, besides recent evolutionary adaptations, other evolutionary dynamics have played a role in the formation of linked gene clusters associated with complex disease traits.

Susceptibility of Anopheles stephensi to Plasmodium gallinaceum: A Trait of the Mosquito, the Parasite, and the Environment

Author: AA Escalante
AA James
BR Laurence
BW Alto
BW Alto
CB Beard
CD Ramsdale
CG Huff
D Ebert
D Fontenille
D Fontenille
D Fontenille
D Nace
DA Boakye
DJ Gubler
DS Falconer
E Brumpt
FH Collins
FH Collins
G Dimopoulos
H Hurd
Howard Hamilton
I Dia
I Ljungstrom
JC Hogg
JC Hogg
JC Koella
JCC Hume
Jen C. C. Hume
JJ Lemasson
JS Jones
KD Vernick
Kevin L. Lee
L Lambrechts
L Lambrechts
L Lambrechts
L Lambrechts
LZ Garamszegi
M Lynch
M Shahabuddin
PW Price
R Poulin
RC Collins
RE Ricklefs
RE Ricklefs
RH Hunt
Robert C. Fleischer
S Gandon
T Lehmann
Tovi Lehmann
WJ Niles
WJ Tabachnick
WJ Tabachnick
WJ Tabachnick
WM Liu
Y Alavi
Publication venue: Public Library of Science
Publication date: 09/06/2011

Vector susceptibility to Plasmodium infection is treated primarily as a vector trait, although it is a composite trait expressing the joint occurrence of the parasite and the vector with genetic contributions of both. A comprehensive approach to assess the specific contribution of genetic and environmental variation on "vector susceptibility" is lacking. Here we developed and implemented a simple scheme to assess the specific contributions of the vector, the parasite, and the environment to "vector susceptibility." To the best of our knowledge this is the first study that employs such an approach.
We conducted selection experiments on the vector (while holding the parasite "constant") and on the parasite (while holding the vector "constant") to estimate the genetic contributions of the mosquito and the parasite to the susceptibility of Anopheles stephensi to Plasmodium gallinaceum. We separately estimated the realized heritability of (i) susceptibility to parasite infection by the mosquito vector and (ii) parasite compatibility (transmissibility) with the vector while controlling the other. The heritabilities of the vector and the parasite were higher for the prevalence, i.e., fraction of infected mosquitoes, than the corresponding heritabilities of parasite load, i.e., the number of oocysts per mosquito.
The vector's genetics (heritability) comprised 67% of "vector susceptibility" measured by the prevalence of mosquitoes infected with P. gallinaceum oocysts, whereas the specific contribution of parasite genetics (heritability) to this trait was only 5%. Our parasite source might possess minimal genetic diversity, which could explain its low heritability (and the high value of the vector). Notably, the environment contributed 28%. These estimates are relevant only to the particular system under study, but this experimental design could be useful for other parasite-host systems.
The prospects and limitations of the genetic manipulation of vector populations to render the vector resistant to the parasite are better considered on the basis of this framework.

Automation of a problem list using natural language processing

Author: AR Aronson
AR Aronson
AR Aronson
AT McCray
AT McCray
C Friedman
C Friedman
C Friedman
C Friedman
C Friedman
C Friedman
CA Knirsch
CA Sneiderman
CD Manning
D Zingmond
DL Ranum
E Bayegan
E Chi
G Hripcsak
G Hripcsak
G Paterson
G Shadow
GF Cooper
H Bludau
H Goldberg
H Goldberg
H Wasserman
H Xu
HJ Scherpbier
Institute of Medicine (U.S.)
International Organization for Standardization
J Nivre
J Starmer
J Zelingher
JC Reichert
JEF Friedl
JR Campbell
JR Campbell
JS Elkins
JW Hales
K Heitmann
K Thompson
L Christensen
LL Weed
LL Weed
LT Kohn
LW Wright
M Fiszman
M Fiszman
M Fiszman
M Weeber
ML Muller
MS Donaldson
MS Tuttle
N Sager
NL Jain
P Haug
P Nadkerni
P Spyns
Peter J Haug
PF Brennan
PG Mutalik
PJ Haug
PJ Haug
PJ Haug
PL Elkin
Q Zou
RH Dolin
S Meystre
SB Koehler
SC Kleene
SJ Wang
SM Huff
Stephane Meystre
T Payne
TC Rindflesch
TC Rindflesch
W Pratt
W Pratt
WW Chapman
Y Huang
Publication venue: BioMed Central
Publication date: 01/01/2005

BACKGROUND: The medical problem list is an important part of the electronic medical record in development in our institution. To serve the functions it is designed for, the problem list has to be as accurate and timely as possible. However, the current problem list is usually incomplete and inaccurate, and is often totally unused. To alleviate this issue, we are building an environment where the problem list can be easily and effectively maintained.
METHODS: For this project, 80 medical problems were selected for their frequency of use in our future clinical field of evaluation (cardiovascular). We have developed an Automated Problem List system composed of two main components: a background and a foreground application. The background application uses Natural Language Processing (NLP) to harvest potential problem list entries from the list of 80 targeted problems detected in the multiple free-text electronic documents available in our electronic medical record. These proposed medical problems drive the foreground application designed for management of the problem list. Within this application, the extracted problems are proposed to the physicians for addition to the official problem list.
RESULTS: The set of 80 targeted medical problems selected for this project covered about 5% of all possible diagnoses coded in ICD-9-CM in our study population (cardiovascular adult inpatients), but about 64% of all instances of these coded diagnoses. The system contains algorithms to detect first document sections, then sentences within these sections, and finally potential problems within the sentences. The initial evaluation of the section and sentence detection algorithms demonstrated a sensitivity and positive predictive value of 100% when detecting sections, and a sensitivity of 89% and a positive predictive value of 94% when detecting sentences.
CONCLUSION: The global aim of our project is to automate the process of creating and maintaining a problem list for hospitalized patients and thereby help to guarantee the timeliness, accuracy and completeness of this information.

Springer - Publisher Connector

Evidence for Hitchhiking of Deleterious Mutations within the Human Genome

Deleterious mutations present a significant obstacle to adaptive evolution. Deleterious mutations can inhibit the spread of linked adaptive mutations through a population; conversely, adaptive substitutions can increase the frequency of linked deleterious mutations and even result in their fixation. To assess the impact of adaptive mutations on linked deleterious mutations, we examined the distribution of deleterious and neutral amino acid polymorphism in the human genome. Within genomic regions that show evidence of recent hitchhiking, we find fewer neutral but a similar number of deleterious SNPs compared to other genomic regions. The higher ratio of deleterious to neutral SNPs is consistent with simulated hitchhiking events and implies that positive selection eliminates some deleterious alleles and increases the frequency of others. The distribution of disease-associated alleles is also altered in hitchhiking regions. Disease alleles within hitchhiking regions have been associated with auto-immune disorders, metabolic diseases, cancers, and mental disorders. Our results suggest that positive selection has had a significant impact on deleterious polymorphism and may be partly responsible for the high frequency of certain human disease alleles.

Patterns of Ancestry, Signatures of Natural Selection, and Genetic Association with Stature in Western African Pygmies

Author: A La Batide-Alanore
A Leonhardt
AB Migliano
AL Price
AL Price
Alain Froment
AR Boyko
B Pasaniuc
Bart Ferwerda
BF Voight
BS Weir
C Ballard
C Batini
CA Winkler
CC Khor
CD Huff
Charla Lambert
D Lopez Herraez
D Philipson
D Redelman
DL Rimoin
E Patin
G Baumann
G Destro-Bisol
GA McVean
Gabriel Hoffman
GH Perry
H Eleftherohorinou
H Innan
H Lango Allen
H Tang
H Yasukawa
HJ Bandelt
HM Kang
J Chen
J Kamath
Jason Mezey
JD Storey
JE Pool
Jean-Marie Bodo
JK Pickrell
JK Pritchard
JM Akey
JM Kidd
Joseph P. Jarvis
Joshua M. Akey
JZ Li
K Bryc
K Tang
L Quintana-Murci
Larsson Omberg
Laura B. Scheinfeldt
LG Moore
LJ Young
M Bozzola
M Joron
M Pelican
M Stephens
M Stephens
M Stephens
MB Lanktree
MD Shriver
MG de Silva
N Davila
NS Becker
P Librado
P Moorjani
P Scheet
P Verdu
PA Fujita
PC Sabeti
PR Dormitzer
R Chakraborty
R Kimura
S Jain
S Ludwig
S Purcell
S Sankararaman
SA Miller
SA Tishkoff
Sameer Soi
Sarah A. Tishkoff
SH Williamson
SJ Kang
ST Sherry
TJ Merimee
TJ Merimee
TJ Merimee
TJ Pemberton
William Beggs
WS Alexander
Y Benjamini
Y Chen
Y Hattori
Publication venue: Public Library of Science
Publication date: 01/01/2012

African Pygmy groups show a distinctive pattern of phenotypic variation, including short stature, which is thought to reflect past adaptation to a tropical environment. Here, we analyze Illumina 1M SNP array data in three Western Pygmy populations from Cameroon and three neighboring Bantu-speaking agricultural populations with whom they have admixed. We infer genome-wide ancestry, scan for signals of positive selection, and perform targeted genetic association with measured height variation. We identify multiple regions throughout the genome that may have played a role in adaptive evolution, many of which contain loci with roles in growth hormone, insulin, and insulin-like growth factor signaling pathways, as well as immunity and neuroendocrine signaling involved in reproduction and metabolism. The most striking results are found on chromosome 3, which harbors a cluster of selection and association signals between approximately 45 and 60 Mb. This region also includes the positional candidate genes DOCK3, which is known to be associated with height variation in Europeans, and CISH, a negative regulator of cytokine signaling known to inhibit growth hormone-stimulated STAT5 signaling. Finally, pathway analysis for genes near the strongest signals of association with height indicates enrichment for loci involved in insulin and insulin-like growth factor signaling.

Horizon / Pleins textes